Add amdgpu target #134740
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @GuillaumeGomez (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
These commits modify compiler targets.
This PR changes how LLVM is built. Consider updating src/bootstrap/download-ci-llvm-stamp.
Some changes occurred in src/doc/rustc/src/platform-support
cc @Noratrieb
This PR modifies If appropriate, please update |
r? jieyouxu |
cc @eddyb Hello, tagging you for domain expertise if you want to chime in. |
Thanks for the PR, @Flakebi. I'm going to request that you open an MCP at https://github.com/rust-lang/compiler-team/issues/ to gauge team consensus for adding this target, primarily to give compiler team members some opportunity to ask clarifying questions and register possible concerns, since:
Note that adding more "conventional" Tier 3 targets usually does not need to go through the MCP process, but this target does not look so conventional. |
@rustbot author |
Thank you for the quick review! I opened an MCP here: rust-lang/compiler-team#823 |
cc @ZuseZ4 |
☔ The latest upstream changes (presumably #134822) made this pull request unmergeable. Please resolve the merge conflicts. |
Add amdgpu target

Add amdgpu target to rustc and enable the LLVM target.

Fix compiling `core` with the amdgpu:
The amdgpu backend makes heavy use of different address spaces. This leads to situations, where a pointer in one addrspace needs to be casted to a pointer in a different addrspace. `bitcast` is invalid for this case, `addrspacecast` needs to be used.

Fix compilation failures that created bitcasts for such cases by creating pointer casts (which creates an `addrspacecast` under the hood) instead.

MCP: rust-lang/compiler-team#823
Tracking issue: rust-lang#135024
Kinda related to the original amdgpu tracking issue rust-lang#51575 (though that one has been closed for a while).

try-job: dist-loongarch64-linux
try-job: dist-loongarch64-muls
try-job: dist-powerpc64-linux
A job failed! Check out the build log: (web) (plain) Click to see the possible cause of the failure (guessed by this bot)
ah its called musl and not muls :) |
Add amdgpu target

Add amdgpu target to rustc and enable the LLVM target.

Fix compiling `core` with the amdgpu:
The amdgpu backend makes heavy use of different address spaces. This leads to situations, where a pointer in one addrspace needs to be casted to a pointer in a different addrspace. `bitcast` is invalid for this case, `addrspacecast` needs to be used.

Fix compilation failures that created bitcasts for such cases by creating pointer casts (which creates an `addrspacecast` under the hood) instead.

MCP: rust-lang/compiler-team#823
Tracking issue: rust-lang#135024
Kinda related to the original amdgpu tracking issue rust-lang#51575 (though that one has been closed for a while).

try-job: dist-loongarch64-linux
try-job: dist-loongarch64-musl
try-job: dist-powerpc64-linux
☀️ Try build successful - checks-actions |
@bors r=workingjubilee |
💡 This pull request was already approved, no need to approve it again.
☀️ Test successful - checks-actions |
Finished benchmarking commit (c03c38d): comparison URL.
Overall result: ❌ regressions - please read the text below
Our benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression
Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)
Results (primary 2.1%, secondary 3.3%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
Results (primary 2.4%, secondary -2.6%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 777.876s -> 781.62s (0.48%) |
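For readers checking the bootstrap figure above, here is a minimal sketch of how a relative change like this is computed. The function name `percent_change` is made up for illustration and is not taken from the perf tooling.

```rust
/// Relative change from `before` to `after`, expressed in percent.
/// Hypothetical helper; not part of rustc-perf.
pub fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

Calling `percent_change(777.876, 781.62)` gives roughly 0.48, matching the reported bootstrap slowdown.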
The |
I'm not sure that size is really a driver for more instruction counts, but it does look like the mere possibility of cross-compiling to AMDGPU is enabling more passes/logic(?) even if presumably those don't do anything on x86. Maybe an optimization opportunity for LLVM/clang? It might be unavoidable with the architecture LLVM has today though. Sampling a few cachegrind diffs: helloworld:
clap:
what the eff. we should probably look into this more, it’s super weird |
I remember that binary/dynamic library size increases sometimes also increased icounts because the dynamic linker was doing more work. But based on Cachegrind, it looks like LLVM is actually doing more work. Maybe it iterates over more passes that were enabled by the amdgpu target? |
The max-rss increases also look unexpected, and numerous enough to not be measurement noise. (Could memory allocation be hiding behind the ??? entries in these cg reports? It sometimes does this for me rather than finding jemalloc/malloc. Some tests with local builds and better debuginfo could be interesting. That would also help with checking the cycles and wall time results, which seemingly aren’t super stable in these results.) Does this need more time to bake maybe? @Mark-Simulacrum you’ve marked this as triaged because it’s less actionable on our side than in llvm, right? |
[experiment] don't init anything except x86
What if we do not always init all LLVM targets? This might fix the regression in rust-lang#134740.
r? `@ghost`
`@rustbot` label +S-experimental
By the way, there is a similar list of targets at https://github.com/rust-lang/rust/blob/c182ce9cbc8c29ebc1b4559d027df545e6cdd287/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp#L81-L186, but it is missing amdgpu. Does amdgpu work without it?
Please kick off a perf run.
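The idea behind this experiment (initialize only the backend that is actually needed, rather than all of them up front) could be sketched roughly as follows. This is an illustration only, not rustc's code: `TargetRegistry`, `backend_for_triple` and `ensure_initialized` are hypothetical names, and the arch-to-backend table is a tiny assumed subset.

```rust
use std::collections::HashSet;

/// Map the architecture component of a target triple to an LLVM backend name.
/// Assumed, deliberately incomplete table for illustration.
pub fn backend_for_triple(triple: &str) -> Option<&'static str> {
    let arch = triple.split('-').next()?;
    match arch {
        "x86_64" | "i686" | "i586" => Some("X86"),
        "aarch64" => Some("AArch64"),
        "amdgcn" => Some("AMDGPU"),
        _ => None,
    }
}

/// Hypothetical registry that remembers which backends were initialized.
#[derive(Debug, Default)]
pub struct TargetRegistry {
    initialized: HashSet<&'static str>,
}

impl TargetRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Initialize the backend for `triple` on first use.
    /// Returns `Some(true)` if it was newly initialized, `Some(false)` if it
    /// already was, and `None` if the triple is unknown to this sketch.
    pub fn ensure_initialized(&mut self, triple: &str) -> Option<bool> {
        let backend = backend_for_triple(triple)?;
        // A real implementation would call the backend's LLVM
        // initialization routines here; this sketch only records the name.
        Some(self.initialized.insert(backend))
    }

    /// Number of backends initialized so far.
    pub fn initialized_count(&self) -> usize {
        self.initialized.len()
    }
}
```

With this shape, compiling only for `x86_64-unknown-linux-gnu` would never touch the AMDGPU backend, which is the effect the experiment is trying to measure.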
hate to go "ooh, LLVM troubles, let's tell Nikita!" but uhhhh "weird LLVM perf" really does need the Vibe Sense of that kind of expertise, sooo cc @nikic |
Adding the amdgpu target shouldn't make any additional passes run -- additional cost from registering additional passes etc is plausible though. max-rss increasing with increasing code size is pretty common. |
Add amdgpu target to rustc and enable the LLVM target.

Fix compiling `core` with the amdgpu:
The amdgpu backend makes heavy use of different address spaces. This leads to situations, where a pointer in one addrspace needs to be casted to a pointer in a different addrspace. `bitcast` is invalid for this case, `addrspacecast` needs to be used.

Fix compilation failures that created bitcasts for such cases by creating pointer casts (which creates an `addrspacecast` under the hood) instead.

MCP: rust-lang/compiler-team#823
Tracking issue: #135024
Kinda related to the original amdgpu tracking issue #51575 (though that one has been closed for a while).
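
The cast rule from the description can be sketched as a small standalone model. This is not the code from the PR or rustc's codegen API: `PtrTy`, `PtrCast`, `select_ptr_cast` and `render_cast` are hypothetical names, and the model assumes LLVM's opaque pointers, where two pointer types differ only in their address space.

```rust
/// Hypothetical model of an opaque LLVM pointer type: only the address
/// space distinguishes one pointer type from another.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PtrTy {
    pub addr_space: u32,
}

/// Which cast, if any, is needed between two pointer types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PtrCast {
    /// Same address space: no instruction is needed.
    None,
    /// Different address spaces: `bitcast` is invalid, `addrspacecast` is required.
    AddrSpaceCast,
}

/// Decide the cast based on the source and destination address spaces.
pub fn select_ptr_cast(src: PtrTy, dst: PtrTy) -> PtrCast {
    if src.addr_space == dst.addr_space {
        PtrCast::None
    } else {
        PtrCast::AddrSpaceCast
    }
}

/// Render a pointer type in LLVM IR syntax (`ptr` for address space 0).
pub fn render_ptr_ty(ty: PtrTy) -> String {
    if ty.addr_space == 0 {
        "ptr".to_string()
    } else {
        format!("ptr addrspace({})", ty.addr_space)
    }
}

/// Render the right-hand side of the cast instruction for value `%value`,
/// or `None` when no cast is needed.
pub fn render_cast(value: &str, src: PtrTy, dst: PtrTy) -> Option<String> {
    match select_ptr_cast(src, dst) {
        PtrCast::None => None,
        PtrCast::AddrSpaceCast => Some(format!(
            "addrspacecast {} %{} to {}",
            render_ptr_ty(src),
            value,
            render_ptr_ty(dst)
        )),
    }
}
```

For example, casting a pointer from address space 5 to address space 0 (on AMDGPU these are, as far as I recall from the LLVM documentation, the private and flat address spaces) renders as `addrspacecast ptr addrspace(5) %p to ptr`, while a cast within one address space needs no instruction at all.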